Uniform Resource Locators (URLs)

HTML documents use a standard format to refer to graphics files and other HTML documents. The Internet defines something called a Uniform Resource Locator (URL) for referring to resources (primarily files), and Web browsers use it as well. A typical URL is http://www.winmag.com/common/feedback.htm.

Don't let the high-falutin' name fool you. A URL is basically just a file name. It tells the browser the server, directory, file name, and extension, so that the browser knows how to retrieve the file.

URL Anatomy

The basic form of a URL is

protocol://domain/directory/name.ext#anchor

Here's what all the pieces mean:

protocol The communication protocol that should be used to retrieve the data. There are many possible protocols, but here are the ones you are most likely to see (and use):

http HyperText Transfer Protocol. The most common protocol used with the Web, and also the default protocol in most cases.

https HTTP Secure (HTTP sent over an encrypted SSL/TLS connection). A version of HTTP that encrypts the data so that it can't be (easily) stolen. This is what you want when you send your credit card number to a web site.

ftp File Transfer Protocol. Most often used for downloading files from archive sites onto the local hard disk.

file This means the URL is actually a file on the local hard disk. You generally wouldn't put this in your HTML files, but you may see it if you use a browser's File/Open command to open a file from the local disk.

domain The domain name for the server where the resource (file) lives. This is something like whitehouse.gov or www.winmag.com. There's no requirement that a domain name for a Web site start with "www" but it's a common convention.

directory Directory on the web server where the resource is located. If the resource is in the "home" directory there won't be a directory name.

name.ext The name of the resource (a file, usually). The name doesn't need to be limited to 8.3 DOS file names. The actual name limits depend on the file system on the Web server, not on the browser requesting the file.

anchor A location in the HTML file that should be displayed. In a large document, this lets you jump to a point other than the top of the file.
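
To make the pieces concrete, here is a minimal Python sketch that takes a URL apart with the standard library's `urllib.parse.urlsplit`. The URL is the article's example with a made-up `#top` anchor added for illustration, and splitting the path into directory and file name at the last slash is a simplification for this example, not something the library does for you.

```python
from urllib.parse import urlsplit

# The article's example URL, plus a hypothetical "#top" anchor.
url = "http://www.winmag.com/common/feedback.htm#top"

parts = urlsplit(url)

# The path holds both the directory and the file name; split at the last slash.
directory, _, filename = parts.path.rpartition("/")

print("protocol: ", parts.scheme)    # http
print("domain:   ", parts.netloc)    # www.winmag.com
print("directory:", directory)       # /common
print("name.ext: ", filename)        # feedback.htm
print("anchor:   ", parts.fragment)  # top
```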

Not all the components of a URL are needed for every reference. If you don't specify the protocol and domain, the browser assumes you want to use the same protocol and domain as the HTML file it currently has loaded. If you don't use an anchor, the browser will put you at the top of the file.

If you don't specify a file name the Web server usually chooses a default file name such as default.htm or index.html. The actual default name of the file is determined by the server, not the browser.
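
As a rough illustration of that server-side choice, the sketch below shows one way a server might pick a default file when the URL ends in a directory. Everything here is hypothetical: `file_to_serve`, `DEFAULT_NAMES`, and the `site` set are invented for the example, and real Web servers set this up through their own configuration.

```python
# Hypothetical sketch of a server-side rule; real servers configure this differently.
DEFAULT_NAMES = ("index.html", "default.htm")  # the two names mentioned in the text

def file_to_serve(url_path, existing_files):
    """Return the file a server might send back for url_path, or None."""
    if not url_path.endswith("/"):
        # A file name was given, so serve it only if it exists.
        return url_path if url_path in existing_files else None
    # No file name was given, so try each default name in order.
    for name in DEFAULT_NAMES:
        candidate = url_path + name
        if candidate in existing_files:
            return candidate
    return None

# A made-up set of files on the server.
site = {"/news/default.htm", "/common/sitemap.htm"}

print(file_to_serve("/news/", site))               # /news/default.htm
print(file_to_serve("/common/sitemap.htm", site))  # /common/sitemap.htm
```

The browser only ever sends the path; which default name wins is entirely up to the rule on the server.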

Examples of URLs

Let's say the web browser has the file http://www.winmag.com/news/default.htm loaded. Here are a few examples of URLs and what will happen when they're used as links and a user clicks on them:

`http://www.yahoo.com/`	Loads the default page from the Yahoo server. The trailing slash could have been left off.
`../common/sitemap.htm`	This is a "relative URL" that loads the page `http://www.winmag.com/common/sitemap.htm`. The ".." in the URL says to go up one level in the directory structure; that takes you from "/news" to "/".
`/common/sitemap.htm`	Another relative URL that loads the page `http://www.winmag.com/common/sitemap.htm`. Since the protocol and domain weren't specified, the ones for the current page are used.
`oldnews.htm`	Yet another relative URL, this one loading `http://www.winmag.com/news/oldnews.htm`. Since no directory was specified, the current page's directory ("/news") is used.
`#RegistryInfo`	Moves to the anchor point named "RegistryInfo" in the current page. If there isn't such an anchor, browsers generally leave the page where it is rather than scrolling anywhere.
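
The resolutions listed above can be checked with Python's standard `urllib.parse.urljoin`, which resolves a link against the URL of the page it appears on. This is only a sketch of the resolution step; a browser does considerably more when it actually follows a link.

```python
from urllib.parse import urljoin

# The page the browser currently has loaded.
base = "http://www.winmag.com/news/default.htm"

links = [
    "http://www.yahoo.com/",
    "../common/sitemap.htm",
    "/common/sitemap.htm",
    "oldnews.htm",
    "#RegistryInfo",
]

# Resolve each link the way a browser would before requesting it.
resolved = {link: urljoin(base, link) for link in links}

for link, target in resolved.items():
    print(f"{link:25} -> {target}")
```

Note that the anchor-only link resolves to the current page's own URL with `#RegistryInfo` appended, which is why the browser doesn't need to load anything new.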

Absolute vs. Relative URLs

As you build Web pages, you'll often have the choice between using relative or absolute URL references. If you're building pages on your own hard disk, relative references have quite a few advantages. You can put your files in one or more directories on your own hard disk, build and test them, and then move those directories to your server without having to change any of the links.

If you use absolute URLs to pages that you're in the process of building on your local hard disk, you won't be able to test those links locally. Still, there are pluses to absolute URLs. If someone copies a file off your web server onto their hard disk and later views it, the absolute URL references will still go to the right location. Relative references will break unless the person copied the referred-to files to their local disk as well.
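
The trade-off is easy to see by resolving the same two links from a local copy of a page and from the copy on the server. In this sketch the local path `file:///home/me/site/news/default.htm` is hypothetical; any local directory would behave the same way.

```python
from urllib.parse import urljoin

# The same page in two places; the local path is a made-up example.
local_page = "file:///home/me/site/news/default.htm"
server_page = "http://www.winmag.com/news/default.htm"

relative_link = "../common/sitemap.htm"
absolute_link = "http://www.winmag.com/common/sitemap.htm"

for page in (local_page, server_page):
    print(page)
    print("  relative ->", urljoin(page, relative_link))
    print("  absolute ->", urljoin(page, absolute_link))
```

The relative link follows the page wherever it lives, so it works locally only if the referred-to file was copied too. The absolute link always points at the server, whether or not that is what you want while testing.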

I generally favor relative references to the greatest extent possible, unless the file is likely to be copied often and used for local browsing. Even then, you may find it's best to create a specially edited package of files that people can download. That way they'll see exactly what you intended, graphics and all.
